
fix: batch resolve PostHog production errors (network-transport noise + auto-resolve stale)#351

Closed
JeffOtano wants to merge 2 commits into main from fix/posthog-batch-2026-05-09

Conversation

@JeffOtano
Owner

@JeffOtano JeffOtano commented May 9, 2026

Summary

Triaged the top 50 active PostHog issues (last 30 days) for project Roni / 360337. The vast majority were already covered by the deployed posthogBeforeSend filter (commit 5b397e7, 2026-05-04) but show up in PostHog because the filter only stops new events; it does not auto-resolve historical issues. The only post-deploy noise still slipping through was unactionable browser-network failures originating from Convex transport on flaky user connections.

This PR closes that gap by extending both posthogBeforeSend and sentryBeforeSend to suppress Failed to fetch (Chromium/Safari) and NetworkError when attempting to fetch resource (Firefox). These TypeErrors are thrown by the browser's fetch() implementation when it cannot reach the server (offline, DNS, TLS, connection drop) — they originate before any application code runs. Convex transport already auto-reconnects with a UI indicator, so they are not actionable application bugs.
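
The substring-based suppression described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the actual contents of posthogBeforeSend.ts and sentryBeforeSend.ts are not shown on this page, and `shouldDropMessage` is a hypothetical helper name.

```typescript
// Hypothetical sketch; the real filter modules are not reproduced in this PR page.
const SUPPRESSED_MESSAGE_SUBSTRINGS: readonly string[] = [
  // Firefox wording for a fetch() call that never reached the server.
  "NetworkError when attempting to fetch resource",
  // Chromium wording for the same failure (the PR also attributes it to Safari).
  "Failed to fetch",
];

// Returns true when the error message contains any suppressed substring.
function shouldDropMessage(message: string | undefined): boolean {
  if (!message) return false;
  return SUPPRESSED_MESSAGE_SUBSTRINGS.some((needle) => message.includes(needle));
}
```

A before-send hook built on this would return `null` for dropped events and pass everything else through unchanged.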

Fixes in this PR

| PostHog issue | File | Occurrences | Users | Fix summary | Specialist used |
|---|---|---|---|---|---|
| 019dfc33 NetworkError | src/lib/posthogBeforeSend.ts + src/lib/sentryBeforeSend.ts | 3 | 1 | Add NetworkError when attempting to fetch resource to suppression lists | direct (2-line + tests) |
| 019dfa87 Failed to fetch | same | 2 | 1 | Add Failed to fetch to suppression lists | direct |
| 019dd58e Failed to fetch | same | 2 | 2 | Same | direct |
| 019d4a22 NetworkError ... chatty-hawk-29 | same | 13 | 1 | Same | direct |
| 019d982a Failed to fetch | same | 1 | 1 | Same | direct |
| 019da185 Failed to fetch | same | 1 | 1 | Same | direct |

Test plan

  • npx vitest run src/lib/posthogBeforeSend.test.ts src/lib/sentryBeforeSend.test.ts — 50 passed
  • npm test — 1673 passed / 13 skipped
  • npx tsc --noEmit — clean
  • npm run lint — clean
  • New regression tests added: drops Firefox NetworkError fetch failures (Convex transport offline) and drops Chromium/Safari Failed to fetch transport failures in both filter test files.

Auto-resolved by deployed filter (no code change)

The following issues are already suppressed by the existing posthogBeforeSend / sentryBeforeSend filters but were never marked resolved. They will be closed in PostHog after merge per the task's Step 9. Their last_seen timestamps all predate the filter deploy on 2026-05-04, confirming the deployed filter is working.

| PostHog issue | Description | Last seen | Why already filtered |
|---|---|---|---|
| 019d510a | "function call turn comes immediately after" + Gemini quota events | 2026-04-28 | function call turn comes immediately after, exceeded your current quota |
| 019d7fcc, 019d7fd4, 019d8244, 019d8337 | `[CONVEX A(chat:createThreadWithMessage)] Server Error` | 2026-04-12..14 | Pre-dates BYOK validation reorganization in commit 48a134c; not recurring |
| 019d7fd4-619c, 019d8244, 019d84a3, 019d8842, 019d8771 | Failed after 3 attempts. ... quota / high demand | 2026-04-12..19 | exceeded your current quota, model is currently experiencing high demand |
| 019d88ad, 019da66b, 019d8ddb, 019d9493, 019d930b, 019d916e | Gemini quota messages | 2026-04-13..22 | exceeded your current quota |
| 019dcf31, 019db06b, 019d9197, 019db5af | Gemini high-demand | 2026-04-15..27 | model is currently experiencing high demand |
| 019dea0c | Gemini prepayment credits depleted | 2026-05-02 | credits are depleted |
| 019d96f1 | Anthropic credit balance too low | 2026-04-17 | Pre-dates filter; one-off, not recurring |
| 019d4119 | "Script error." cross-origin | 2026-04-26 | Script error. |
| 019d4c7b | Chrome extension runtime.sendMessage | 2026-04-26 | Invalid call to runtime.sendMessage |
| 019da074 | minified n.standardSelectors | 2026-04-18 | n.standardSelectors |
| 019db793 | benign ResizeObserver loop | 2026-04-22 | ResizeObserver loop |
| `019d8488*` / `019d8489*` / 019d858a (8 issues) | Firefox iOS reader-mode `__firefox__.reader` / `__firefox__` | 2026-04-12..13 | `__firefox__.reader` |

All of the above are also listed in sentryBeforeSend.ts with the same suppression strings, so they will not generate further events.

Skipped

  • Chunk load errors (019d8795, 019d7950) — 2 + 1 occurrences. Next.js handles ChunkLoadError automatically by reloading the page; further suppression would risk hiding genuine deploy / asset-pipeline regressions. Left active as low-signal monitoring.
  • Generic Server Error with no context (019dc294, 019dc297, 019d9197-ad8e) — 3 / 1 / 1 occurrences. Below the auto-skip threshold (<5 occ AND <2 users); diagnosis not actionable without more context.
  • All issues <5 occ AND <2 users per the task's auto-skip rule.
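
The auto-skip rule quoted above (<5 occ AND <2 users) can be written as a small predicate. This is only a sketch of the stated threshold; the `IssueStats` shape and the `isAutoSkipped` name are assumptions for illustration, not part of the triage tooling.

```typescript
// Hypothetical shape for the per-issue counts used during triage.
interface IssueStats {
  occurrences: number;
  users: number;
}

// An issue is auto-skipped only when BOTH counts fall below their thresholds.
function isAutoSkipped(issue: IssueStats): boolean {
  return issue.occurrences < 5 && issue.users < 2;
}
```

Because the two conditions are joined with AND, an issue with many occurrences from a single user, or few occurrences spread over two or more users, is not skipped by this rule.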

Deferred — needs human input

None.

Dropped after failed verification

None. Single change passed type-check, lint, and full test suite on first run.

Constraints honored

  • No drive-by refactors.
  • No new dependencies.
  • No schema or auth changes.
  • No any, no weakened types.
  • No --no-verify or hook bypasses.
  • File-size caps respected (smallest possible change).

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Provider response failures are now properly detected and handled through existing retry and error recovery mechanisms
    • Network connectivity errors are filtered from error reporting to reduce notification noise
    • Improved error classification for provider-related failures
  • Tests

    • Added test coverage for streaming error scenarios and network failure handling


JeffOtano and others added 2 commits May 6, 2026 13:14
Treat provider streams that finish with an error reason as provider failures so BYOK handling can finalize the pending message with a safe user-facing error instead of leaving the generic empty-message fallback.

Add regression coverage for OpenAI Responses stream failures that do not throw provider exceptions.

Co-authored-by: Codex <noreply@openai.com>
Add `Failed to fetch` (Chromium/Safari) and `NetworkError when attempting
to fetch resource` (Firefox) to both posthogBeforeSend and sentryBeforeSend
suppression lists. These TypeErrors are thrown by the browser's fetch()
implementation when it cannot reach the server (offline, DNS, TLS,
connection drops) — they originate before any application code runs and
Convex transport already auto-reconnects with a UI indicator, so they
are not actionable application bugs.

Resolves PostHog issues:
- 019dfc33 NetworkError when attempting to fetch resource (3 occ)
- 019dfa87 Failed to fetch (2 occ)
- 019dd58e Failed to fetch (2 occ)
- 019d4a22 NetworkError ... chatty-hawk-29.convex.cloud (13 occ)
- 019d982a Failed to fetch (1 occ)
- 019da185 Failed to fetch (1 occ)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Contributor

coderabbitai Bot commented May 9, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b373854a-fd32-403f-8e06-7b2a75b5ac39

📥 Commits

Reviewing files that changed from the base of the PR and between 7c96055 and a981fdd.

📒 Files selected for processing (7)
  • convex/ai/byokErrors.ts
  • convex/ai/resilience.ts
  • convex/ai/resilienceStreamFailure.test.ts
  • src/lib/posthogBeforeSend.test.ts
  • src/lib/posthogBeforeSend.ts
  • src/lib/sentryBeforeSend.test.ts
  • src/lib/sentryBeforeSend.ts

Walkthrough

This PR adds provider response failure detection in streaming contexts and expands error suppression for browser-side transport failures. Provider streaming completions with finishReason: "error" now throw an error that propagates through retry/circuit-breaker handling and classifies as byok_unknown_error. Browser fetch failures (NetworkError, Failed to fetch) are additionally suppressed in both PostHog and Sentry telemetry.

Changes

Provider Response Failure Detection

| Layer / File(s) | Summary |
|---|---|
| Error Classification: convex/ai/byokErrors.ts | classifyByokError recognizes provider_response_failed text and returns byok_unknown_error. |
| Streaming Detection: convex/ai/resilience.ts | attemptStream checks accumulator.finishReason after streaming; if "error", throws Error("provider_response_failed") to trigger error handling. |
| Test Coverage: convex/ai/resilienceStreamFailure.test.ts | Test suite verifies streaming failures are detected, classified, and trigger error recording and message mutation. |
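
The detection and classification steps summarized above might look roughly like this. It is a simplified sketch under stated assumptions: `StreamAccumulator`, `assertStreamSucceeded`, and `classifyError` are hypothetical stand-ins, and the real attemptStream and classifyByokError in convex/ai/ are not shown on this page.

```typescript
// Hypothetical accumulator shape; the real one in convex/ai/resilience.ts may differ.
interface StreamAccumulator {
  text: string;
  finishReason?: string;
}

// Throws when the stream completed without an exception but reported an error finish reason.
function assertStreamSucceeded(accumulator: StreamAccumulator): void {
  if (accumulator.finishReason === "error") {
    throw new Error("provider_response_failed");
  }
}

// Simplified stand-in for classifyByokError: maps the marker text to a category.
function classifyError(error: unknown): "byok_unknown_error" | null {
  const message = error instanceof Error ? error.message : String(error);
  return message.includes("provider_response_failed") ? "byok_unknown_error" : null;
}
```

Throwing a plain Error lets the failure travel through whatever retry and circuit-breaker handling already wraps the stream attempt, instead of adding a second error path.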

Browser Transport Error Suppression

| Layer / File(s) | Summary |
|---|---|
| PostHog Suppression: src/lib/posthogBeforeSend.ts | SUPPRESSED_MESSAGE_SUBSTRINGS extended with NetworkError when attempting to fetch resource and Failed to fetch. |
| PostHog Tests: src/lib/posthogBeforeSend.test.ts | Tests verify Firefox and Chromium/Safari fetch failures are dropped. |
| Sentry Suppression: src/lib/sentryBeforeSend.ts | SUPPRESSED_MESSAGE_SUBSTRINGS extended with browser fetch and network failure patterns. |
| Sentry Tests: src/lib/sentryBeforeSend.test.ts | Tests assert Sentry drops Firefox and Chromium/Safari transport failures. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • JeffOtano/roni#345: Directly related; implements the same provider_response_failed detection and byok_unknown_error classification in resilience.ts and byokErrors.ts.
  • JeffOtano/roni#317: Modifies the same PostHog and Sentry suppression lists to extend filtering for noisy errors.
  • JeffOtano/roni#246: Related; modifies convex/ai/resilience.ts streaming behavior and error handling logic.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a981fdda2b


// to fetch resource"; Chromium-based browsers and Safari emit "Failed to
// fetch".
"NetworkError when attempting to fetch resource",
"Failed to fetch",

P1: Scope fetch-noise suppression to Convex transport

Avoid dropping every "Failed to fetch" / Firefox network error globally: shouldDropPosthogEvent and shouldDropSentryEvent apply these substrings to all client exceptions, so any real outage (bad API host, CORS/TLS misconfig, backend unreachability) will now be silently filtered out of both PostHog and Sentry instead of being observable. Please gate this suppression to known Convex transport contexts (for example by stack/origin/URL metadata) rather than a blanket message match.
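
One way to gate the suppression as suggested is to require both a fetch-noise message and evidence that the failed request targeted Convex. The sketch below assumes a hypothetical `requestUrl` field on the event and assumes Convex traffic goes to `*.convex.cloud` hosts (as in the chatty-hawk-29.convex.cloud issue listed in this PR); real PostHog and Sentry events carry different metadata, so this shows the idea only.

```typescript
// Hypothetical event shape; not the actual PostHog or Sentry event type.
interface ClientErrorEvent {
  message: string;
  requestUrl?: string; // assumed metadata: URL of the failed request, if known
}

const FETCH_NOISE: readonly string[] = [
  "NetworkError when attempting to fetch resource",
  "Failed to fetch",
];

// Assumption: Convex deployments are reached via *.convex.cloud hostnames.
function isConvexUrl(url: string | undefined): boolean {
  if (!url) return false;
  try {
    return new URL(url).hostname.endsWith(".convex.cloud");
  } catch {
    return false; // unparseable URL: treat as not Convex, so the event is kept
  }
}

// Drops fetch noise only when it can be tied to Convex transport.
function shouldDropFetchNoise(event: ClientErrorEvent): boolean {
  const isFetchNoise = FETCH_NOISE.some((needle) => event.message.includes(needle));
  return isFetchNoise && isConvexUrl(event.requestUrl);
}
```

With this shape, a "Failed to fetch" raised against any other host, or with no URL metadata at all, still reaches error reporting.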


@JeffOtano
Owner Author

Closing as code-owner triage. The intended network-error suppression is small, but this branch is stale against current main: merge-tree reports an add/add conflict in convex/ai/resilienceStreamFailure.test.ts, and the branch also predates the fast-uri and Gemini thinking fixes that have now landed. Please recreate the network suppression as a clean, focused PR from current main if we still want it.

@JeffOtano JeffOtano closed this May 9, 2026